Serveur d'exploration sur l'Université de Trèves

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Estimating Clustering Indexes in Data Streams

Identifieur interne : 000064 ( LNCS/Analysis ); précédent : 000063; suivant : 000065

Estimating Clustering Indexes in Data Streams

Auteurs : Luciana S. Buriol [Brésil] ; Gereon Frahling [États-Unis] ; Stefano Leonardi [Italie] ; Christian Sohler [Allemagne]

Source :

RBID : ISTEX:485A339ABBCCE2B9488B249981FEEA298729FE1F

Abstract

Abstract: We present random sampling algorithms that with probability at least 1 − δ compute a (1 ±ε)-approximation of the clustering coefficient and of the number of bipartite clique subgraphs of a graph given as an incidence stream of edges. The space used by our algorithm to estimate the clustering coefficient is inversely related to the clustering coefficient of the network itself. The space used by our algorithm to compute the number K 3,3 of bipartite cliques is proportional to the ratio between the number of K 1,3 and K 3,3 in the graph. Since the space complexity depends only on the structure of the input graph and not on the number of nodes, our algorithms scale very well with increasing graph size. Therefore they provide a basic tool to analyze the structure of dense clusters in large graphs and have many applications in the discovery of web communities, the analysis of the structure of large social networks and the probing of frequent patterns in large graphs. We implemented both algorithms and evaluated their performance on networks from different application domains and of different size; The largest instance is a webgraph consisting of more than 135 million nodes and 1 billion edges. Both algorithms compute accurate results in reasonable time on the tested instances.

Url:
DOI: 10.1007/978-3-540-75520-3_55


Affiliations:


Links toward previous steps (curation, corpus...)


Links to Exploration step

ISTEX:485A339ABBCCE2B9488B249981FEEA298729FE1F

Le document en format XML

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Estimating Clustering Indexes in Data Streams</title>
<author>
<name sortKey="Buriol, Luciana S" sort="Buriol, Luciana S" uniqKey="Buriol L" first="Luciana S." last="Buriol">Luciana S. Buriol</name>
</author>
<author>
<name sortKey="Frahling, Gereon" sort="Frahling, Gereon" uniqKey="Frahling G" first="Gereon" last="Frahling">Gereon Frahling</name>
</author>
<author>
<name sortKey="Leonardi, Stefano" sort="Leonardi, Stefano" uniqKey="Leonardi S" first="Stefano" last="Leonardi">Stefano Leonardi</name>
</author>
<author>
<name sortKey="Sohler, Christian" sort="Sohler, Christian" uniqKey="Sohler C" first="Christian" last="Sohler">Christian Sohler</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:485A339ABBCCE2B9488B249981FEEA298729FE1F</idno>
<date when="2007" year="2007">2007</date>
<idno type="doi">10.1007/978-3-540-75520-3_55</idno>
<idno type="url">https://api.istex.fr/document/485A339ABBCCE2B9488B249981FEEA298729FE1F/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001B74</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001B74</idno>
<idno type="wicri:Area/Istex/Curation">001A57</idno>
<idno type="wicri:Area/Istex/Checkpoint">000650</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000650</idno>
<idno type="wicri:doubleKey">0302-9743:2007:Buriol L:estimating:clustering:indexes</idno>
<idno type="wicri:Area/Main/Merge">001464</idno>
<idno type="wicri:Area/Main/Curation">001344</idno>
<idno type="wicri:Area/Main/Exploration">001344</idno>
<idno type="wicri:Area/LNCS/Extraction">000064</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Estimating Clustering Indexes in Data Streams</title>
<author>
<name sortKey="Buriol, Luciana S" sort="Buriol, Luciana S" uniqKey="Buriol L" first="Luciana S." last="Buriol">Luciana S. Buriol</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Brésil</country>
<wicri:regionArea>Federal University of Rio Grande do Sul, Porto Alegre</wicri:regionArea>
<placeName>
<settlement type="city">Porto Alegre</settlement>
<region type="state">Rio Grande do Sul</region>
</placeName>
<orgName type="university">Université fédérale du Rio Grande do Sul</orgName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Brésil</country>
</affiliation>
</author>
<author>
<name sortKey="Frahling, Gereon" sort="Frahling, Gereon" uniqKey="Frahling G" first="Gereon" last="Frahling">Gereon Frahling</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Google Research, New York</wicri:regionArea>
<placeName>
<region type="state">État de New York</region>
</placeName>
</affiliation>
<affiliation></affiliation>
</author>
<author>
<name sortKey="Leonardi, Stefano" sort="Leonardi, Stefano" uniqKey="Leonardi S" first="Stefano" last="Leonardi">Stefano Leonardi</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Italie</country>
<wicri:regionArea>University of Rome “La Sapienza”, Rome</wicri:regionArea>
<placeName>
<settlement type="city">Rome</settlement>
<region nuts="2">Latium</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Italie</country>
</affiliation>
</author>
<author>
<name sortKey="Sohler, Christian" sort="Sohler, Christian" uniqKey="Sohler C" first="Christian" last="Sohler">Christian Sohler</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Heinz Nixdorf Institute and University of Paderborn, Paderborn</wicri:regionArea>
<wicri:noRegion>Paderborn</wicri:noRegion>
<wicri:noRegion>Paderborn</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2007</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">485A339ABBCCE2B9488B249981FEEA298729FE1F</idno>
<idno type="DOI">10.1007/978-3-540-75520-3_55</idno>
<idno type="ChapterID">55</idno>
<idno type="ChapterID">Chap55</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: We present random sampling algorithms that with probability at least 1 − δ compute a (1 ±ε)-approximation of the clustering coefficient and of the number of bipartite clique subgraphs of a graph given as an incidence stream of edges. The space used by our algorithm to estimate the clustering coefficient is inversely related to the clustering coefficient of the network itself. The space used by our algorithm to compute the number K 3,3 of bipartite cliques is proportional to the ratio between the number of K 1,3 and K 3,3 in the graph. Since the space complexity depends only on the structure of the input graph and not on the number of nodes, our algorithms scale very well with increasing graph size. Therefore they provide a basic tool to analyze the structure of dense clusters in large graphs and have many applications in the discovery of web communities, the analysis of the structure of large social networks and the probing of frequent patterns in large graphs. We implemented both algorithms and evaluated their performance on networks from different application domains and of different size; The largest instance is a webgraph consisting of more than 135 million nodes and 1 billion edges. Both algorithms compute accurate results in reasonable time on the tested instances.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Allemagne</li>
<li>Brésil</li>
<li>Italie</li>
<li>États-Unis</li>
</country>
<region>
<li>Latium</li>
<li>Rio Grande do Sul</li>
<li>État de New York</li>
</region>
<settlement>
<li>Porto Alegre</li>
<li>Rome</li>
</settlement>
<orgName>
<li>Université fédérale du Rio Grande do Sul</li>
</orgName>
</list>
<tree>
<country name="Brésil">
<region name="Rio Grande do Sul">
<name sortKey="Buriol, Luciana S" sort="Buriol, Luciana S" uniqKey="Buriol L" first="Luciana S." last="Buriol">Luciana S. Buriol</name>
</region>
<name sortKey="Buriol, Luciana S" sort="Buriol, Luciana S" uniqKey="Buriol L" first="Luciana S." last="Buriol">Luciana S. Buriol</name>
</country>
<country name="États-Unis">
<region name="État de New York">
<name sortKey="Frahling, Gereon" sort="Frahling, Gereon" uniqKey="Frahling G" first="Gereon" last="Frahling">Gereon Frahling</name>
</region>
</country>
<country name="Italie">
<region name="Latium">
<name sortKey="Leonardi, Stefano" sort="Leonardi, Stefano" uniqKey="Leonardi S" first="Stefano" last="Leonardi">Stefano Leonardi</name>
</region>
<name sortKey="Leonardi, Stefano" sort="Leonardi, Stefano" uniqKey="Leonardi S" first="Stefano" last="Leonardi">Stefano Leonardi</name>
</country>
<country name="Allemagne">
<noRegion>
<name sortKey="Sohler, Christian" sort="Sohler, Christian" uniqKey="Sohler C" first="Christian" last="Sohler">Christian Sohler</name>
</noRegion>
<name sortKey="Sohler, Christian" sort="Sohler, Christian" uniqKey="Sohler C" first="Christian" last="Sohler">Christian Sohler</name>
</country>
</tree>
</affiliations>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Rhénanie/explor/UnivTrevesV1/Data/LNCS/Analysis
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000064 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/LNCS/Analysis/biblio.hfd -nk 000064 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Wicri/Rhénanie
   |area=    UnivTrevesV1
   |flux=    LNCS
   |étape=   Analysis
   |type=    RBID
   |clé=     ISTEX:485A339ABBCCE2B9488B249981FEEA298729FE1F
   |texte=   Estimating Clustering Indexes in Data Streams
}}

Wicri

This area was generated with Dilib version V0.6.31.
Data generation: Sat Jul 22 16:29:01 2017. Site generation: Wed Feb 28 14:55:37 2024